Adaptive beamformer, sidelobe canceller, handsfree speech communication device, voice
control unit, audio based tracking device, and method of sidelobe canceling






The invention relates to an adaptive beamformer and a sidelobe canceller
comprising such an adaptive beamformer.

The invention also relates to a handsfree speech communication device, voice 
control unit and tracking device for tracking an audio producing object, comprising such an 
adaptive beamformer or sidelobe canceller. 

The invention also relates to a consumer apparatus comprising such a voice
control unit.

The invention also relates to a method of adaptive beamforming or sidelobe
canceling.



An embodiment of a sidelobe canceller and comprised beamformer as
announced in the first paragraph is known from the publication "C. Fancourt and L. Parra:
The generalized sidelobe decorrelator. Proceedings of the IEEE Workshop on Applications of
Signal Processing to Audio and Acoustics 2001". A sidelobe canceller is designed to lock in on
a desired sound source, i.e. producing an output audio signal predominantly corresponding to
the sound from the desired sound source, while rejecting as much as possible sound from
other sources, called noise. To realize this the sidelobe canceller comprises an adaptive
beamformer processing signals from an array of microphones, of which the beamformer filters
can be optimized, so that they represent the inverse of the paths of the desired audio from the
desired sound source to each of the microphones (i.e. the desired audio is modified by e.g.
reflecting off various surfaces and finally entering a particular microphone from different
directions). By summing the filtered signals, the beamformer effectively realizes a direction
sensitivity pattern which has a lobe of high sensitivity in the direction of the desired sound
source. E.g. for filters which are pure delays, the beamformer realizes a sin(x)/x pattern with
a main lobe and side lobes. The problem with such a sensitivity pattern however is that
sound from other sources may also be picked up. E.g. a noise source may be situated in the
direction of one of the side lobes. To resolve this problem, the sidelobe canceller also
comprises an adaptive noise cancellation stage. From the microphone measurements, noise
reference signals are calculated by blocking the desired sound component from them, i.e. in
the example the noise in the sidelobes is determined. By means of an adaptive filter it is
estimated from these noise measurements how much of the noise sources leaks into the lobe
pattern directed towards the desired sound. Finally, this noise is subtracted from what is
picked up in the main lobe, leaving as a final audio signal largely only desired sound. If a
directivity pattern is calculated corresponding to this optimized sidelobe canceller, it contains
a main lobe towards the desired sound source, and zeroes in the directions of the noise
sources.
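
By way of a non-limiting illustration, the lobe pattern of a beamformer whose filters are pure delays can be evaluated numerically. The sketch below assumes a uniform linear array, far-field plane waves and a speed of sound of 343 m/s; the function name, the array geometry and the frequency are hypothetical and are not taken from the cited publication. For such a discrete array the pattern is the periodic counterpart of sin(x)/x.

```python
import cmath
import math


def delay_sum_response(num_mics, spacing_m, freq_hz, steer_deg, arrival_deg, c=343.0):
    """Normalized magnitude response of a delay-and-sum beamformer.

    Assumes a uniform linear array and far-field plane waves; angles are
    measured from broadside. All parameters are illustrative assumptions.
    """
    k = 2.0 * math.pi * freq_hz / c  # wavenumber in rad/m
    total = 0j
    for m in range(num_mics):
        # Residual phase at microphone m after the steering delay is applied.
        phase = k * m * spacing_m * (
            math.sin(math.radians(arrival_deg)) - math.sin(math.radians(steer_deg))
        )
        total += cmath.exp(1j * phase)
    return abs(total) / num_mics


# Sound arriving from the steered direction falls in the main lobe.
main_lobe = delay_sum_response(8, 0.04, 2000.0, 0.0, 0.0)
# A source between 40 and 90 degrees is attenuated but still picked up by a side lobe.
side = max(delay_sum_response(8, 0.04, 2000.0, 0.0, a) for a in range(40, 91))
```

The non-zero side lobe level is what the adaptive noise cancellation stage is meant to compensate.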

There are a number of problems with the prior art sidelobe canceller and
beamformer, leading to the fact that in practice it does not work like it ideally should. Firstly,
there is not necessarily a physical difference between sound from a desired sound source, e.g.
a speaker, and sound from a noise source, e.g. sound of a motor. So instead of locking on to
the speaker, the system may diverge towards the noise source, and have a main lobe towards
a direction in between the desired sound source and the noise source. In the sidelobe
canceller, this leads to the fact that the noise references contain speech or in general desired
sound, and hence instead of canceling only noise from the sound picked up by the main lobe,
also part of the desired sound is cancelled. For speech this may be particularly unacceptable.
The sidelobe canceller with a microphone array may in some cases even work worse than a
single microphone without sidelobe canceller. Such a noise coming from a particular
direction (e.g. a second speaker) is called correlated noise, since each of the microphones
picks up a related sound, e.g. a delayed version. Secondly there is the problem of so-called
uncorrelated noise, in which case the signals of the microphones are orthogonal.
Uncorrelated noise can originate e.g. from the diffuse sound field (many independent sources
such as e.g. from reverberation, or wind noise for a car), or just electronic noise in the
microphones. This noise can also interfere with the functioning of the sidelobe canceller.
Prior art sidelobe cancellers may contain a speech detector to try to solve these problems. It is
assumed that the desired sound source is a speaker, and the noise sources are not. The
beamformer is only adapted if it receives speech, typically by a maximisation of its output
importantly, such speech detectors are not very robust, making such sidelobe cancellers still
relatively bad. Good sidelobe cancellers are especially difficult to design for environments in
which the direction of the desired sound source and/or the noise sources are changing, hence
for which the filters may have to re-adapt during relatively short time intervals. However this
situation is quite common, e.g. in a teleconference system which attempts to track a speaker
moving through a room, or in a system with a person speaking to a sidelobe canceller
incorporated in a mobile phone, and together with the mobile phone moving through a
variable environment, such as e.g. encountered with a handsfree car phone kit. What was
described for a sidelobe canceller is also a problem for an adaptive beamformer associated
with another noise removal strategy.

It is a first object of the invention to provide an adaptive beamformer which is
relatively robust against the influences of noises. This first object is realized in that the
adaptive beamformer comprises:

- a filtered sum beamformer arranged to process input audio signals from an
array of respective microphones, and arranged to yield as an output a first audio signal
predominantly corresponding to sound from a desired audio source, by filtering with a first
set of respective adaptable filters the input audio signals, the filtered sum beamformer being
adaptive in the sense that coefficients of the first set of adaptable filters are susceptible to be
changed by adding to at least one coefficient a difference value, obtained as a function of an
adaptation step size;

- a connection for providing a noise measurement derivable from at least one
of the input audio signals;

- a subtracter to subtract the noise measurement from the first audio signal to
obtain a noise cleaned second audio signal; and

- a scaling factor determining unit, arranged to provide a scale factor evaluated
as a first function of a ratio of a first variable derived from the first audio signal and a second
variable derived from the noise measurement, and arranged to scale the adaptation step size
with the scale factor.

A more continuous evaluation of whether the adaptive beamformer is locking
on to the desired sound or not is desired for a robust adaptive beamformer, not just a binary
speech/non-speech decision, since with such a continuous function, the adaptive beamformer
can afford to make evaluation mistakes. If with the binary criterion noise is erroneously



PHNL031379EPP 



4 24.11.2003 
identified as speech, the beamformer will start adapting fully to the noise and hence become
non-optimal. A mechanism is needed with which, in cases of erroneous adaptation of the
beamformer in response to incoming noise, the beamformer is only adapted a little in
parameter space. This can be realized by making the adaptation step dependent on the
outcome of a function indicating how well the beamformer is optimized and how much noise
is coming in, capable of making the beamformer non-optimal. These two factors together can
be grouped in an equation specifying a scale factor being a function F1 of a ratio of
1) any variable indicative of the desired signal (e.g. speech) and derived from the
first audio signal (e.g. the first audio signal itself or a further processed version thereof); and
2) any variable indicative of the noise, i.e. dependent on the noise measurement,
e.g. the signal from a second microphone.

Note that the word derived should be read as comprising both explicit
functional evaluation (e.g. by processing software code specifying a mathematical
relationship), and signal processing resulting from the signal traversing circuitry (e.g. passing
through a subtracter which eliminates noise).

If this function is large, it indicates that the beamformer is doing its job rather
well, and that it will probably also adapt well, so a large adaptation step may be used, so that
moving desired sound sources can be tracked. Vice versa, if the function indicates that the
beamformer is not or cannot be working well (e.g. due to the presence of a strong interfering
noise source), the adaptation step size should be made small, since the filtered sum
beamformer filter coefficients will not adapt to the correct values, but rather become even
more wrong. The adaptation step is hence taken to be proportional to the scale factor.

The adaptive beamformer, or any of its embodiments, may be comprised in a
sidelobe canceller, which further comprises an adaptive noise estimator, arranged to derive an
estimated noise signal y by filtering respective noise measurements x1, x2, x3 derived from
the input audio signals with a second set of adaptable filters (g1, g2). In this sidelobe
canceller the subtracter is connected to subtract the estimated noise signal y from the first
audio signal to obtain the noise cleaned second audio signal, and the scaling factor unit is
reliable noise estimate than a simple single noise measurement x1, provided of course that all
filters are reasonably well adapted.

The sidelobe canceling is working well if desired audio is inputted together
with noise of a type for which the sidelobe canceller is optimized to cancel it (i.e. a few
correlated noise sources in directions for which the direction sensitivity pattern has zeroes),
as contrasted to the sidelobe canceller working badly if the filters are not optimal (i.e. e.g. the
main lobe is directed in between the direction of the desired sound source and a direction of a
noise source) and/or there is uncorrelated noise. If the sidelobe canceller is mainly picking up
the desired sound, it may adapt with a large adaptation step size, to be able to quickly track a
moving desired source. If however the sidelobe cancellation is having problems staying
focused on the desired sound source (e.g. because of interfering noise sources), it will
probably become even worse with a large adaptation step size (especially if it is only slightly
misadapted), and hence the adaptation step size should be small. A similar rationale applies
to the noise estimator/canceller, which is vice versa designed to adapt mainly to noise and not
to the desired signal, e.g. speech. With such a continuous evaluation both the filtered sum
beamformer and the noise estimator of the noise canceller can be adapted simultaneously if
so desired, or each in its own complementary time intervals as with a prior art speech
detector.

A first embodiment of the adaptive beamformer or of the sidelobe canceller
comprising such an adaptive beamformer has the coefficients of the first set of filters (and
preferably for the sidelobe canceller also the coefficients of the second set of filters) specified
in the frequency domain, and is arranged for having the adaptation step size scaled per
predetermined frequency range by the ratio (Q) being

$Q[f,t] = \frac{P_{zz}[f,t] - C\,P_{A(x1)A(x1)}[f,t]}{P_{zz}[f,t]}$

in which $P_{zz}[f,t]$ is a measure of the power of the first audio signal z in the predetermined
frequency range around frequency f and for a time instant t, $P_{A(x1)A(x1)}[f,t]$ is a measure of
the power of a noise signal derived from at least one noise measurement x1 by a
transformation A, and C is a constant. Instead of the power, also the amplitude or another
function of the amplitude of the signals used in the ratio equation may be used.
An appropriate and preferable transformation A for the sidelobe canceller is the
transformation produced by applying the noise estimation filtering on the noise estimates x1,
x2, x3, and yielding the estimated noise signal y. In that exemplary case $Q[f,t]$ reads

$Q[f,t] = \frac{P_{zz}[f,t] - C\,P_{yy}[f,t]}{P_{zz}[f,t]}$

The filters may already be well adapted for most frequencies, but a noise in a
particular frequency band may appear or move relative to the sidelobe canceller. In this case
only the coefficients in the particular frequency band need to be adapted. Hence preferred
embodiments of the adaptive beamformer/sidelobe canceller according to the invention will
work with filters specified in the frequency domain, although also time domain filters, or
other representations may be used. In this first embodiment option the signal in the ratio
equation being used as an estimate of the desired sound is the power of the first audio signal
output by the beamformer. Instead of exactly taking the output of the beamformer, a number
of elementary signal shaping operations may be performed before the first audio signal is
taken to the scaling factor determining unit, e.g. since the noise estimation typically incurs an
additional delay, a delay element is typically introduced behind the beamformer. It is then
preferable to take the first audio signal after the delay, since this signal is in synchronization
with the noise signal. If the sidelobe canceller is well adapted and there is little noise present,
then the noise power in the above equation is negligible compared to the desired sound
power, making the numerator approximately equal to the denominator. If vice versa there is a
lot of noise present, the numerator will be small compared to the denominator, making the
ratio small. The above equation has values between zero and one, implying that a suggested
step size can be scaled between the suggestion and zero by simple multiplication with the
above equation. Whereas the beamformer filters are typically adjusted by scaling their
adaptation step size with the evaluation result from the above equation, the noise
estimator/canceller filters are typically scaled with 1 minus that evaluation result.
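
The per-band scaling described above may be sketched as follows, purely as an illustration. The sketch assumes that the ratio takes the form (P_zz - C * P_yy) / P_zz suggested by the description of its numerator and denominator, and that it is limited to zero from below; the function names and the band powers are hypothetical.

```python
def band_ratios(p_zz, p_yy, c=1.0):
    """Per-band ratio Q = (P_zz - C * P_yy) / P_zz, lower-limited to zero.

    p_zz and p_yy are lists of band powers; the form of Q is an assumption
    based on the surrounding description.
    """
    ratios = []
    for zz, yy in zip(p_zz, p_yy):
        q = (zz - c * yy) / zz if zz > 0.0 else 0.0
        ratios.append(max(0.0, q))
    return ratios


def scaled_step_sizes(base_step, ratios):
    """Beamformer steps scale with Q, noise estimator steps with 1 - Q."""
    beamformer = [base_step * q for q in ratios]
    noise_estimator = [base_step * (1.0 - q) for q in ratios]
    return beamformer, noise_estimator


# Hypothetical band powers: the middle band carries a strong interferer.
p_zz = [4.0, 2.0, 1.0]
p_yy = [0.2, 1.8, 0.0]
q = band_ratios(p_zz, p_yy)
bf_steps, ne_steps = scaled_step_sizes(0.1, q)
```

In this example only the middle band slows down the beamformer adaptation, while the remaining bands keep a step size close to the suggested one.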

A second embodiment of the adaptive beamformer or of the sidelobe canceller
comprising such an adaptive beamformer has the coefficients of the first set of filters
specified in the frequency domain, and is arranged for having the adaptation step size
scaled per predetermined frequency range by the ratio (Q) being

in which $P_{zz}[f,t]$ is a measure of the power of the first audio signal (z) in the predetermined
frequency range around frequency f and for a time instant t, $P_{A(x1)A(x1)}[f,t]$ is a measure of
obtained after subtracting residual noise from the first audio signal, it is supposed to be an
even more accurate estimate of the desired audio signal. It is judged that a signal further in
the processing line of algorithms for obtaining the desired signal forms a more accurate basis
for a decision like e.g. whether the beamformer should adapt if the system is near optimum,
but the resulting signal may also be far worse than an estimate obtained by a few simple
algorithms if the sidelobe canceller is far from optimum. Hence when using such a sidelobe
canceller topology for updating the filters a classical speech detector may lead to totally
unacceptable results and a continuous criterion for scaling the step sizes may be the only
viable option. Similar equations, and equivalent sidelobe canceller updating topologies, may
be derived for using signals obtained after further processing (e.g. typically to further reduce
the amount of residual noise, or to further clean up the desired sound or speech) as reference
signal.

It is advantageous if the adaptive beamformer/sidelobe canceller comprises a
speech detector providing on the basis of the first audio signal a Boolean designation
Speech/Noise, and arranged to adapt only the first set of filters if the designation is Speech, and for
the sidelobe canceller only the second set of filters if the designation is Noise. The
beamformer may then be arranged to only adapt its filters - with the scaled adaptation step
size - in case the desired sound is speech.

It is also advantageous if the adaptive beamformer/sidelobe canceller is
arranged to apply a binary decision function to the ratio, and arranged to adapt only the first
set of filters if the decision is 1, and only the second set of filters if the decision is 0. E.g.
values of either of the above two equations larger than 0.5 result in only the beamformer
filters being updated, i.e. in a decision equaling 1, obtained in this example by rounding
towards the nearest integer. Whereas a speech detector can only discriminate between speech
and non-speech noise - and often in an unreliable manner - using the ratio in a detector has the
advantage that the sidelobe canceller can be used for locking onto all kinds of non-speech
desired sound, such as the sound of an animal like a singing bird, or a sound produced by an
apparatus.
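
A minimal sketch of such a binary decision function is given below, assuming the threshold of 0.5 mentioned in the example. The function name and the dictionary keys are hypothetical; an explicit comparison is used instead of a library rounding function so that the value 0.5 itself is unambiguously treated as decision 0.

```python
def select_filters_to_adapt(q, threshold=0.5):
    """Map the ratio q to a binary decision and the filter set it enables.

    Decision 1 enables the first (beamformer) filter set, decision 0 the
    second (noise estimator) filter set. The threshold is an assumption
    taken from the example in the text.
    """
    decision = 1 if q > threshold else 0
    return {
        "decision": decision,
        "adapt_beamformer": decision == 1,
        "adapt_noise_estimator": decision == 0,
    }


mostly_desired_sound = select_filters_to_adapt(0.8)
mostly_noise = select_filters_to_adapt(0.3)
```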

The adaptive beamformer and sidelobe canceller may typically be applied in
all kinds of (e.g. typically handsfree) speech communication devices, e.g. a pod for
teleconferencing to be placed on a table, or a car kit, or regular mobile phone, personal digital
assistant, dictation apparatuses or other device with similar communication capabilities. The
adaptive beamformer/sidelobe canceller is also advantageous in a voice controlled apparatus,
such as e.g. a remote control for a television, or a speech to text system on a PC, to improve
the speech identification capabilities of the apparatus, noise being an important problem for
those devices. Other devices may be all kinds of consumer devices, elevators or parts of
intelligent houses, security systems, e.g. systems relying on voice recognition, consumer
interaction terminals, etc.

The system may also be used in a tracking device, typically used in security
applications, or applications which monitor user behavior for some reason. An example may
be a camera that zooms in on a burglar based on his characteristic noise.

It is a second object of the invention to provide a method of sidelobe canceling corresponding
to the functioning of the sidelobe canceller as described above. The second object is realized
in that the method comprises:

- beamforming filtering input audio signals from an array of respective
microphones with a first set of respective adaptable beamforming filters (f1(-t), f2(-t), f3(-t)),
yielding a first audio signal predominantly corresponding to sound from a desired audio
source, the beamforming filtering being adaptive in the sense that coefficients of the first set
of adaptable filters are changed by adding to at least one coefficient a difference value
obtained as a function of an adaptation step size;

- deriving a noise measurement (x1) from at least one of the input audio signals;

- subtracting the noise measurement from the first audio signal (z) to obtain a
noise cleaned second audio signal (r);

- determining a scale factor (S) as a first function (F1) of a ratio (Q) of a first
variable (F2) derived from the first audio signal (z) and a second variable (F3) derived from
the noise measurement (x1); and

- scaling the adaptation step size with the scale factor.



These and other aspects of the sidelobe canceller according to the invention
will be apparent from and elucidated with reference to the implementations and embodiments
described hereinafter, and with reference to the accompanying drawings, which serve merely

In Fig. 1, sound from a desired sound source 160, and possibly also from one
or more undesirable noise sources 161, travels to an array of at least two microphones 101,
103, 105. The signals u1, u2, u3 output by these microphones are filtered by a first set of
respective filters f1(-t), f2(-t), f3(-t) of a beamformer 107, the coefficients of which - typically
a coefficient per band of frequencies - are adaptable to changing conditions in a room, e.g. of
the desired sound source 160. The resulting signals outputted by the respective filters are
summed by an adder 110, yielding a first audio signal z. Ideally the filters represent the
inverse paths of the desired sound towards a particular microphone, hence by filtering a first
microphone signal u1 by the first filter f1(-t) ideally exactly the desired sound is obtained.
Hence, if the filters are well adapted, the first audio signal z is a good approximation to the
desired sound. However, since the microphones also pick up noise, inevitably the first audio
signal z also contains noise. The microphone signals u1, u2, u3 are also used to produce noise
measurements x1, x2, x3. To obtain signals only representative of the noise, mathematically
speaking orthogonal to the desired audio signal, the desired signal is subtracted from the
microphone signals u1, u2, u3 by respective subtracters 115, 121, 127. A so-called blocking
matrix 111 therefore reapplies the sound traveling path filters f1, f2, f3 on the first audio
signal z, to obtain an estimate of the desired sound as picked up by the microphones. Hence
the filters of the beamformer 107 and the blocking matrix are similar apart from a time
reversal. An adaptive noise estimator 150 estimates on the basis of the noise measurements
x1, x2, x3, as obtained by each of the microphones, how much noise will be picked up in a
main lobe of the beamformer directed towards the desired source or another part of the lobe
pattern directed towards the desired sound, such as a sidelobe of that pattern, hence what the
contribution is of the noise in the first audio signal z. The noise estimator 150 therefore has to
apply a second set of adaptable filters g1, g2, which are again related to the beamformer
filters f1(-t), f2(-t), f3(-t). Because of mathematical dependency of one of the noise
measurements x1, x2, x3 (there are only three microphone measurements leading to a desired
audio signal being the first audio signal z and three noise measurements x1, x2, x3), before
applying the second filters g1, g2, a dimension reduction may be applied. E.g. the third noise
signal may be dropped, or x11 may be defined as x1-(x1+x2+x3)/3 and x12 may be defined
as x2-(x1+x2+x3)/3, etc.
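
The second variant of the dimension reduction can be illustrated for one frequency band as follows. The sketch only implements the mean subtraction given as an example above; the function names and the sample values are hypothetical. It also shows why two references suffice after the subtraction: the three mean-removed signals sum to zero, so the third one is fully determined by the other two.

```python
def remove_common_part(x1, x2, x3):
    """Subtract the mean of the three noise measurements from each of them."""
    mean = (x1 + x2 + x3) / 3.0
    return x1 - mean, x2 - mean, x3 - mean


def reduce_noise_references(x1, x2, x3):
    """Return the two reduced references x11 and x12 fed to the filters g1, g2."""
    x11, x12, _ = remove_common_part(x1, x2, x3)
    return x11, x12


# Hypothetical complex band samples of the three noise measurements.
samples = (0.9 + 0.3j, -0.4 + 0.1j, 0.2 - 0.7j)
x11, x12, x13 = remove_common_part(*samples)
redundancy = abs(x11 + x12 + x13)  # numerically zero: x13 equals -(x11 + x12)
```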

Alternatively three second filters may be adapted, the convergence
automatically taking care of the dependency. Finally a subtracter 142 is comprised for
subtracting the estimated noise signal y from the first audio signal z, the subtracter 142 and
noise estimator 150 together constituting a noise canceller, yielding a second audio signal r,
being relatively free of noise.

The above described system is a sidelobe canceller as known from prior art.

Respective beamformer update units 117, 123, 129 for updating the filters of
the beamformer 107 and blocking matrix 111 are shown in Fig. 1 as forming part of the
blocking matrix, although this need not be so.

A typical update rule for a prior art beamformer may take the first audio signal
z and a respective noise measurement as input and evaluate a new filter coefficient for a
particular frequency range or band around frequency f:

$F[f,t+1] = F[f,t] + \frac{\alpha}{P_{zz}[f,t]}\, z^{*}[f,t]\, x[f,t]$ [Eq. 1]

In this equation F is the particular filter coefficient for a particular frequency
range at discrete time t resp. t+1, $\alpha$ is a constant, $P_{zz}[f,t]$ is a measure of the power of the
first audio signal, x is the respective noise measurement (e.g. x1 for the first filter f1(-t)), and
the star denotes complex conjugation. Hence if the noise is approximately orthogonal to the
desired first audio signal z, as it should be if the sidelobe canceller is optimized, the filter
coefficient is hardly updated.
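
The effect of orthogonality on this update can be illustrated with a small numerical sketch for a single frequency band. The update is assumed to have the normalized form F + (alpha / P_zz) * conj(z) * x described above; the sample values, the constant alpha and the function names are hypothetical, and the power measure is simply fixed to one.

```python
def beamformer_update(coeff, z, x, p_zz, alpha=0.1):
    """One prior-art style update step for a single band (assumed form of Eq. 1)."""
    return coeff + (alpha / p_zz) * z.conjugate() * x


def run_block(z_block, x_block, p_zz=1.0):
    """Apply the update over a block of band samples, starting from zero."""
    coeff = 0j
    for z, x in zip(z_block, x_block):
        coeff = beamformer_update(coeff, z, x, p_zz)
    return coeff


z_block = [1 + 0j, 0 + 1j, -1 + 0j, 0 - 1j]  # hypothetical first audio signal
x_orthogonal = [1 + 0j] * 4                  # noise measurement orthogonal to z over the block
x_leaking = [0.5 * z for z in z_block]       # noise measurement containing leaked desired sound

drift_orthogonal = run_block(z_block, x_orthogonal)
drift_leaking = run_block(z_block, x_leaking)
```

With the orthogonal measurement the individual steps cancel over the block, whereas leakage of the desired sound into the noise measurement moves the coefficient in a consistent direction.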

A typical update rule in a prior art noise canceller update unit 159 for updating
the second set of filters g1, g2 is:

$G[f,t+1] = G[f,t] + \frac{\alpha}{P_{yy}[f,t]}\, r^{*}[f,t]\, y[f,t]$ [Eq. 2],

in which r is the second audio signal, and $P_{yy}[f,t]$ is a measure of the power of the noise
signal y.

For the sidelobe canceller 100 according to the invention, these update steps
(the part after the + sign) are scaled depending on the ratio determining how well the sidelobe
canceller works.

Therefore a scaling factor determining unit 170 is comprised, which has as an
in which C is a predetermined constant, and the other terms have the same meaning as above.
This function should be lower limited to zero, i.e. it should not become negative. It should be
noted that the time instants may be chosen in different ways (known to the skilled person)
and preferably the processing is done on a block basis.
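It can be shown that Eq. 3 is approximately equivalent to:

$S[f,t] \approx \frac{P_{ss}[f,t]}{P_{ss}[f,t] + P_{nn}[f,t]}$, where s is the desired audio signal (e.g. speech
of the desired speaker) and n is the noise, i.e. Eq. 3 is approximately equivalent to

$S[f,t] \approx \frac{SNR[f,t]}{1 + SNR[f,t]}$ with $SNR[f,t] = P_{ss}[f,t] / P_{nn}[f,t]$, i.e. a function of the signal to noise ratio.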

The skilled person will realize that other estimates of the noise may also be
used, hence the noise estimator of the sidelobe canceller is not required. Any combination of
an adaptable filtered sum beamformer (this concept also intended to comprise delay sum
beamformers and similar topologies) and a noise reference, e.g. the signal picked up by any
of the microphones, may be used to compose the core adaptive beamformer according to the
invention.

The scaling factor S is transmitted to the beamformer update units 117, 123,
129, which are according to the invention arranged to scale the update step of the
beamformer filters by multiplying the adaptation step size with the scaling factor S, yielding
an updating rule according to the invention:

$F[f,t+1] = F[f,t] + \frac{\alpha\, S[f,t]}{P_{zz}[f,t]}\, z^{*}[f,t]\, x[f,t]$ [Eq. 4]

Similarly, by scaling the noise estimator filter adaptation step size with 1-S,
the corresponding updating rule is:

$G[f,t+1] = G[f,t] + \frac{\alpha\,(1 - S[f,t])}{P_{yy}[f,t]}\, r^{*}[f,t]\, y[f,t]$ [Eq. 5]

Other functions of this ratio may be used provided that the noise estimator has
a behavior inverse to the beamformer, i.e. the noise estimator predominantly reacts to signals
containing mainly noise and little desired signal energy, e.g. picked up during speech pauses.
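
The complementary scaling of the two update rules may be sketched for a single frequency band as shown below. This is an illustration under assumptions: the scaling factor s is taken as an already evaluated input in the range zero to one, the noise estimator rule is assumed to mirror the prior art rule with its step multiplied by 1 - s, and all names and sample values are hypothetical.

```python
def scaled_updates(f_coeff, g_coeff, z, x, r, y, p_zz, p_yy, s, alpha=0.1):
    """Update one beamformer and one noise estimator coefficient in one band.

    The beamformer step is multiplied by s, the noise estimator step by
    1 - s. p_zz and p_yy are assumed to be strictly positive.
    """
    f_new = f_coeff + (alpha * s / p_zz) * z.conjugate() * x
    g_new = g_coeff + (alpha * (1.0 - s) / p_yy) * r.conjugate() * y
    return f_new, g_new


# Hypothetical band samples and power measures.
signals = dict(z=1 + 1j, x=0.5j, r=1 - 1j, y=2 + 0j, p_zz=2.0, p_yy=4.0)

# s = 1: no noise detected, only the beamformer coefficient moves.
f_clean, g_clean = scaled_updates(0j, 0j, s=1.0, **signals)
# s = 0: mainly noise detected, only the noise estimator coefficient moves.
f_noisy, g_noisy = scaled_updates(0j, 0j, s=0.0, **signals)
```

For intermediate values of s both filter sets adapt simultaneously, each with a correspondingly reduced step.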

As can be seen for e.g. the beamformer filter updating (Eq. 4), if there is a lot
of (correlated or uncorrelated) noise present, then $C P_{yy}[f]$ is relatively large, making
$P_{zz}[f] - C P_{yy}[f]$ smaller than $P_{zz}[f]$, which results in a small step size. If there is no noise
at all, the scaling factor is equal to one.

A speech detector 165 as known from prior art may also be comprised. It is
modified to be able to output a signal Sufi to the beamformer update units 117, 123, 129 in
case the first audio signal z is identified as speech, and the beamformer update units 117,
123, 129 are arranged to only update the filters (f1(-t), f2(-t), f3(-t), f1, f2, f3) if the signal
Sufi is of a particular value, e.g. 1. Similarly a signal SUW enables the adaptation of the
noise estimator 150 filters g1, g2, only in case the speech detector 165 identifies the first
audio signal z as being noise. The speech detection may also be applied to the second audio
signal r as input. Note that in Fig. 1 for clarity of the picture the connections of signals Sufi
and SUW to the update unit are not shown, but they are understood to be of known kinds such
as e.g. wiring, saving and fetching from memory in a software version, etc.

In a further embodiment, the scaling factor determining unit 170 may
comprise a sound type characterization unit 166. Similar to the speech detector 165 this unit
identifies whether the sidelobe canceller is mainly locking on to the desired audio source or
whether it is receiving a lot of noise. The sound type characterization unit 166 is e.g.
arranged to apply a binary decision function to the ratio Q (e.g. rounding to the nearest
integer, 0 or 1), and is as above arranged to output a signal Sufi to adapt the first set of filters
(f1(-t), f2(-t), f3(-t) and also f1, f2, f3) only if the decision is 1, and the second set of filters
(g1, g2) only if the decision is 0. This may increase the robustness of the sidelobe canceller
100 even further.

Fig. 2 shows a topology which is arranged to perform the updating of the
beamforming/blocking filters (f1(-t), f2(-t), f3(-t), f1, f2, f3) as a function of the second audio
signal r. Therefore, second beamformer update units 219, 215, 211 are schematically shown
above the prior art sidelobe canceller part as described before. The second beamformer update
units 219, 215, 211 have as second input a similarly constructed set of second noise
measurements v1, v2, v3, which are constructed with respective subtracters, e.g. subtracter
227 subtracting a filtered version of the second audio signal r with a first blocking filter f1
in which r is the second audio signal, v is one of the second noise measurements v1, v2, v3
corresponding to the particular beamformer filter to be updated and $P_{rr}[f]$ is a measure of
the power of the second audio signal r.

A possible equation for the scaling factor for this sidelobe canceller topology
200, evaluated by a second scaling factor determining unit 250, is:

$S[f,t] = \frac{P_{zz}[f,t] - C\,P_{yy}[f,t]}{P_{rr}[f,t]}$ [Eq. 7]

The scaling of the beamformer 107 filters, blocking matrix 111 filters, and
noise estimator 150 filters is done as described for the topology of Fig. 1.

If there is substantially only correlated noise and near perfect cancellation, the
subtraction at subtracter 142 may be seen as a scalar equation, and by definition
$P_{rr}[f] \approx P_{zz}[f] - C P_{yy}[f]$, since r=z-y, making S approximately equal to 1. If the noise
canceller is ill-adapted, e.g. due to movements of the noise source, since the phase of the
noise is unknown the subtracter 142 cannot perform a noise canceling. E.g. the amplitude of
the noise may be estimated correctly, but if there is a phase difference of 180 degrees, the
estimated noise signal y will be added to instead of subtracted from the first audio signal,
only increasing the noise. Also due to leakage of a lot of energy - even of the desired sound -
in the noise measurements v1, v2, v3, the noise power will be relatively large. In
summary, this results in the fact that $P_{rr}[f] > P_{zz}[f] - C P_{yy}[f]$, giving a scale factor smaller
than one. Also for uncorrelated noise, the noise cannot be subtracted from the first audio
signal z very well, resulting again in $P_{rr}[f] > P_{zz}[f] - C P_{yy}[f]$.
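
The two cases discussed above can be checked numerically with a small sketch for one frequency band. It assumes that the scaling factor for this topology has the form (P_zz - C * P_yy) / P_rr with r = z - y, as implied by the inequalities above; the limitation to the range zero to one, the handling of a zero residual power, the function names and the sample blocks are assumptions made for the illustration.

```python
def mean_power(block):
    """Average power of a block of samples in one frequency band."""
    return sum(abs(v) ** 2 for v in block) / len(block)


def scale_factor_after_cancellation(z_block, y_block, c=1.0):
    """Assumed form S = (P_zz - C * P_yy) / P_rr with r = z - y, limited to [0, 1]."""
    r_block = [z - y for z, y in zip(z_block, y_block)]
    p_zz = mean_power(z_block)
    p_yy = mean_power(y_block)
    p_rr = mean_power(r_block)
    if p_rr == 0.0:
        return 0.0  # assumption: without residual power no adaptation is performed
    return min(1.0, max(0.0, (p_zz - c * p_yy) / p_rr))


desired = [1.0, 1.0, 1.0, 1.0]   # hypothetical desired sound samples
noise = [1.0, -1.0, 1.0, -1.0]   # correlated noise, orthogonal to the desired sound
z_block = [s + n for s, n in zip(desired, noise)]

# Perfect noise estimate: r equals the desired sound and S becomes one.
well_adapted = scale_factor_after_cancellation(z_block, noise)
# Correct amplitude but 180 degrees phase error: the noise is added instead of removed.
phase_flipped = scale_factor_after_cancellation(z_block, [-n for n in noise])
```

In the second case the residual power is five times the numerator, so the adaptation step is reduced accordingly.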

The constant C may be determined in a number of ways. E.g., C may be
determined as:

$C[f,t] = \frac{P_{zz}[f,t]}{P_{yy}[f,t]}$ [Eq. 8],

in which the $P_{zz}$ is then determined during non-speech time slices (i.e. the noise in z). This
may be realized by means of a speech detector, or by looking for low amplitude regions in
the temporal z signal, the low amplitude occurring due to the absence of speech. It can be
seen then that $C \cdot P_{yy}$ yields a good estimate of the noise in z. C may also be predetermined
by optimization tests depending on the application.
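
One conceivable way of evaluating C from measured band powers is sketched below. The sketch assumes that C is the ratio of P_zz to P_yy in time slices without speech, and it averages this ratio over all such slices; the averaging, the function name and the example values are assumptions and are not prescribed by the text.

```python
def estimate_c(p_zz_frames, p_yy_frames, is_speech):
    """Average P_zz / P_yy over the frames flagged as non-speech.

    The speech flags may come from a speech detector or from a low
    amplitude criterion on z; averaging over frames is an assumption.
    """
    ratios = [
        zz / yy
        for zz, yy, speech in zip(p_zz_frames, p_yy_frames, is_speech)
        if not speech and yy > 0.0
    ]
    if not ratios:
        raise ValueError("no usable non-speech frames")
    return sum(ratios) / len(ratios)


# Hypothetical frame powers for one band; frames 1 and 3 contain speech.
p_zz_frames = [0.5, 4.0, 0.6, 5.0]
p_yy_frames = [1.0, 1.1, 1.0, 0.9]
is_speech = [False, True, False, True]
c_estimate = estimate_c(p_zz_frames, p_yy_frames, is_speech)
```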

The algorithmic components disclosed may in practice be (entirely or in part)
realized as hardware (e.g. parts of an application specific IC) or as software
running on a special digital signal processor, a generic processor, etc.
A computer program product should be understood as any physical realization of a
collection of commands enabling a processor (generic or special purpose), after a
series of loading steps to get the commands into the processor, to execute any of
the characteristic functions of an invention. In particular, the computer program
product may be realized as data on a carrier such as e.g. a disk or tape, data
present in a memory, data traveling over a network connection (wired or wireless),
or program code on paper. Apart from program code, characteristic data required for
the program may also be embodied as a computer program product.

It should be noted that the above-mentioned embodiments illustrate rather than
limit the invention. Apart from combinations of elements of the invention as
combined in the claims, other combinations of the elements are possible. Any
combination of elements can be realized in a single dedicated element.

Any reference sign between parentheses in the claim is not intended for
limiting the claim. The word "comprising" does not exclude the presence of elements
or aspects not listed in a claim. The word "a" or "an" preceding an element does not
exclude the presence of a plurality of such elements.

The invention can be implemented by means of hardware or by means of 
software running on a processor. 
CLAIMS:



1. An adaptive beamformer, comprising:
- a filtered sum beamformer (107) arranged to process input audio signals (u1, u2,
u3) from an array of respective microphones (101, 103, 105), and arranged to yield
as an output a first audio signal (z) predominantly corresponding to sound from a
desired audio source (160), by filtering with a first set of respective adaptable
filters (f1(-t), f2(-t), f3(-t)) the input audio signals (u1, u2, u3), the filtered
sum beamformer (107) being adaptive in the sense that coefficients of the first set
of adaptable filters (f1(-t), f2(-t), f3(-t)) are susceptible to be changed by
adding to at least one coefficient a difference value, obtained as a function of an
adaptation step size;
- a connection (199) for providing a noise measurement (x1) derivable from at least
one of the input audio signals (u1, u2, u3);
- a subtracter (142) to subtract the noise measurement (x1) from the first audio
signal (z) to obtain a noise cleaned second audio signal (r); and
- a scaling factor determining unit (170), arranged to provide a scale factor (S)
evaluated as a first function (F1) of a ratio (Q) of a first variable (F2) derived
from the first audio signal (z) and a second variable (F3) derived from the noise
measurement (x1), and arranged to scale the adaptation step size with the scale
factor (S).

2. A sidelobe canceller (100) comprising an adaptive beamformer as claimed in
claim 1, further comprising:
- an adaptive noise estimator (150), arranged to derive an estimated noise signal
(y) by filtering respective noise measurements (x1, x2, x3) derived from the input
audio signals (u1, u2, u3) with a second set of adaptable filters (g1, g2); and
- in which the subtracter (142) is connected to subtract the estimated noise signal
(y) from the first audio signal (z) to obtain the noise cleaned second audio signal
(r); and
- in which the scaling factor unit is arranged to evaluate the scale factor (S) as
the first function (F1) of a ratio (Q) of the first variable (F2) derived from the
first audio signal (z) and a fourth variable derived from the estimated noise signal
(y).
3. An adaptive beamformer as claimed in claim 1 or a sidelobe canceller as
claimed in claim 2, having the coefficients of the first set of filters (f1(-t),
f2(-t), f3(-t)) specified in the frequency domain, and being arranged for having the
adaptation step size scaled per predetermined frequency range by the ratio (Q) being

in which $P_{zz}[f,t]$ is a measure of the power of the first audio signal (z) in
the predetermined frequency range around frequency f and for a time instant t,
$P_{A(x)A(x)}[f,t]$ is a measure of the power of a noise signal derived from at
least one noise measurement (x1) by a transformation A, and C is a constant.

4. An adaptive beamformer as claimed in claim 1 or a sidelobe canceller as
claimed in claim 2, having the coefficients of the first set of filters (f1(-t),
f2(-t), f3(-t)) specified in the frequency domain, and arranged for having the
adaptation step size scaled per predetermined frequency range by the ratio (Q) being

in which $P_{zz}[f,t]$ is a measure of the power of the first audio signal (z) in
the predetermined frequency range around frequency f and for a time instant t,
$P_{A(x)A(x)}[f,t]$ is a measure of the power of a noise signal derived from at
least one noise measurement (x1) by a transformation A, $P_{rr}[f,t]$ is a measure
of the power of the second audio signal (r), and C is a constant.



^' An adaptive beamformer as claimed in claim 1 or a sidelobe canceller as
claimed in claim 2, comprising a speech detector (165) providing on the basis of the
first audio signal (z) a Boolean designation Speech/Noise, and arranged to adapt the
first set of filters (f1(-t), f2(-t), f3(-t)) only if the designation is Speech.
7. A handsfree speech communication device comprising an adaptive beamformer as
claimed in claim 1 or a sidelobe canceller as claimed in claim 2.

8. A voice control unit comprising an adaptive beamformer as claimed in claim 1
or a sidelobe canceller as claimed in claim 2.

9. A consumer apparatus comprising a voice control unit as claimed in claim 8.

10. A tracking device arranged for tracking an audio producing object, comprising
an adaptive beamformer as claimed in claim 1 or a sidelobe canceller as claimed in
claim 2.

11. A method of adaptive beamforming, comprising:
- beamforming filtering input audio signals (u1, u2, u3) from an array of respective
microphones (101, 103, 105) with a first set of respective adaptable beamforming
filters (f1(-t), f2(-t), f3(-t)), yielding a first audio signal (z) predominantly
corresponding to sound from a desired audio source (160), the beamforming filtering
being adaptive in the sense that coefficients of the first set of adaptable filters
(f1(-t), f2(-t), f3(-t)) are changed by adding to at least one coefficient a
difference value obtained as a function of an adaptation step size;
- deriving a noise measurement (x1) from at least one of the input audio signals
(u1, u2, u3);
- subtracting the noise measurement (x1) from the first audio signal (z) to obtain a
noise cleaned second audio signal (r);
- determining a scale factor (S) as a first function (F1) of a ratio (Q) of a first
variable (F2) derived from the first audio signal (z) and a second variable (F3)
derived from the noise measurement (x1); and
- scaling the adaptation step size with the scale factor (S).

12. A computer program product comprising code enabling a processor to execute
the method of claim 11.
ABSTRACT:



The relatively robust adaptive beamformer comprises:
- a filtered sum beamformer (107) to process input audio signals (u1, u2, u3) from
an array of respective microphones (101, 103, 105), and arranged to yield as an
output a first audio signal (z) predominantly corresponding to sound from a desired
audio source (160); and
- an adaptive noise estimator (150), arranged to derive a noise signal (y) which is
subtracted from the first audio signal (z) to obtain a noise cleaned second audio
signal (r),
and further comprises a scaling factor determining unit (170) arranged to provide a
scale factor (S) as a function of a ratio (Q) of the sidelobe canceling, and being
arranged to scale the adaptation step size with the scale factor (S), so that the
sidelobe canceller only adapts quickly if it is relatively well locked on the
desired audio source, but is rather insensitive to interference from noise sources.

Fig. 1